5 research outputs found

Classification interactive multi-label pour l’aide à l’organisation personnalisée des données

Author: Nair-Benrekia Noureddine-Yassine
Publication venue: HAL CCSD
Publication date: 03/11/2015

The growing importance given today to personalized content has led to the development of several interactive classification systems for various novel applications. Nevertheless, all these systems use single-label item classification, which greatly constrains the user's expressiveness. The major problem common to all developers of an interactive multi-label system is: which multi-label classifier should we choose? Experimental evaluations of recent interactive learning systems are mainly subjective, so the scope of their conclusions is limited. To draw more general conclusions that can guide the selection of a suitable learning algorithm during the development of such a system, we extensively study the impact of the major interactivity constraints (learning from few examples in a limited time) on the classifiers' predictive performance and computation time. The experiments demonstrate the potential of an ensemble learning approach, Random Forest of Predictive Clustering Trees (RF-PCT). However, the strong constraint that interactivity imposes on computation time led us to propose a new hybrid learning approach, FMDI-RF+, which combines RF-PCT with an efficient matrix factorization approach for dimensionality reduction. The experimental results indicate that FMDI-RF+ is as accurate as RF-PCT in its predictions, with a significant advantage for FMDI-RF+ in computation speed.

Classification interactive multi-label pour l’aide à l’organisation personnalisée des données

Author: Nair-Benrekia Noureddine-Yassine
Publication venue: HAL CCSD
Publication date: 03/11/2015

Thèses en Ligne

Selecting a multi-label classification method for an interactive system

Authors: Kuntz Pascale, Meyer Frank, Nair-Benrekia Noureddine-Yassine
Publication venue: HAL CCSD
Publication date: 02/06/2014

Interactive classification-based systems engage users to coach learning algorithms so that they take individual preferences into account. However, most recent interactive systems limit users to single-label classification, which may not be expressive enough in some organization tasks, such as film classification, where a multi-label scheme is required. The objective of this paper is to compare the behavior of 12 multi-label classification methods in an interactive framework where "good" predictions must be produced in a very short time from a very small set of multi-label training examples. The experiments highlight important performance differences on 4 complementary evaluation measures (Log-Loss, Ranking-Loss, Learning Time and Prediction Time). The best results are obtained for Multi-label k Nearest Neighbours (ML-kNN), Ensemble of Classifier Chains (ECC) and Ensemble of Binary Relevance (EBR).
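As an illustration of one of the measures named in this abstract, the following is a minimal sketch of Ranking-Loss in its common textbook form: the fraction of (relevant, irrelevant) label pairs that a classifier orders incorrectly, averaged over instances. The function name `ranking_loss`, the tie handling (ties count as errors) and the convention that instances without both kinds of label contribute 0 are assumptions made here, not details taken from the paper.

```python
from typing import Sequence


def ranking_loss(y_true: Sequence[Sequence[int]],
                 scores: Sequence[Sequence[float]]) -> float:
    """Average fraction of mis-ordered (relevant, irrelevant) label pairs.

    y_true[i][j] is 1 if label j is relevant for instance i, else 0.
    scores[i][j] is the classifier's confidence for label j on instance i.
    """
    if len(y_true) == 0:
        raise ValueError("at least one instance is required")

    total = 0.0
    for labels, row in zip(y_true, scores):
        relevant = [s for l, s in zip(labels, row) if l == 1]
        irrelevant = [s for l, s in zip(labels, row) if l == 0]
        if not relevant or not irrelevant:
            # Assumption: no pair can be formed, so this instance adds 0.
            continue
        # A pair is mis-ordered when the relevant label is not scored
        # strictly higher than the irrelevant one (ties count as errors).
        bad = sum(1 for r in relevant for i in irrelevant if r <= i)
        total += bad / (len(relevant) * len(irrelevant))
    return total / len(y_true)


# Two instances, three labels each (illustrative values only).
example_truth = [[1, 0, 1], [0, 1, 0]]
example_scores = [[0.9, 0.2, 0.1], [0.3, 0.8, 0.1]]
print(ranking_loss(example_truth, example_scores))
```

Lower values are better; a value of 0 means every relevant label is scored above every irrelevant one.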

Interactive multi-label classification for data personalization

Authors: Kuntz Pascale, Meyer Frank, Nair-Benrekia Noureddine-Yassine
Publication venue: HAL CCSD
Publication date: 01/01/2013

Recent interactive machine learning solutions have shown that embedding the end-user into the system might be the best way to fit his/her individual preferences. We focus here on a user-centered classification process that allows the user to interact with pre-computed results via a friendly visual interface. Recent experiments showed that interactive classification yields accurate results when classifiers are given a sufficient number of user-presented examples (Drucker et al., 2011). However, efficient interactive learning solutions generally limit users to single labeling, which may not be expressive enough in real-life situations, for instance in organization tasks such as text labeling or multi-criteria recommendation, where the user will naturally seek to handle multiple labels. In parallel, multi-label classification has received significant attention over the past few years (Madjarov et al., 2012). But, as far as we know, integrating multi-label approaches into an interactive learning framework still raises open questions. In this communication, we survey the state of the art in multi-label classification algorithms capable of withstanding the interactivity constraints. We complete the presentation with initial experimental results on datasets of various sizes from the literature (e.g. Music and IMDB).

Learning from multi-label data with interactivity constraints: an extensive experimental study

Authors: Kuntz Pascale, Meyer Frank, Nair-Benrekia Noureddine-Yassine
Publication venue: Elsevier BV
Publication date: 09/03/2015

Interactive classification aims at introducing user preferences into the learning process to produce individualized outcomes better adapted to each user's behaviour than fully automatic approaches. Current interactive classification systems generally adopt a single-label classification paradigm that constrains items to carry one label at a time and consequently limits the user's expressiveness when he/she interacts with data that are inherently multi-label. Moreover, the experimental evaluations are mainly subjective and closely depend on the targeted use cases and the interface characteristics. This paper presents the first extensive study of the impact of the interactivity constraints on the performance of a large set of twelve well-established multi-label learning methods. We restrict ourselves to evaluating the classifiers' predictive performance and computation time as the number of training examples regularly increases, and we focus on the beginning of the classification task, where few examples are available. Classifier performance is evaluated with an experimental protocol independent of any implementation environment, on a set of twelve multi-label benchmarks of various sizes from different domains. Our comparison shows that four classifiers stand out for prediction quality: RF-PCT (Random Forest of Predictive Clustering Trees, Kocev (2012)), EBR (Ensemble of Binary Relevance, Read et al. (2011)), CLR (Calibrated Label Ranking, Fürnkranz et al. (2008)) and MLkNN (Multi-label kNN, Zhang and Zhou (2007)), with an advantage for the first two ensemble classifiers. Moreover, only RF-PCT competes with the fastest classifiers, and it is therefore considered the most promising classifier for an interactive multi-label learning system.